Tilt Matching for Scalable Sampling and Fine-Tuning
Potaptchik, Peter, Lee, Cheuk-Kit, Albergo, Michael S.
We propose a simple, scalable algorithm for using stochastic interpolants to sample from unnormalized densities and for fine-tuning generative models. The approach, Tilt Matching, arises from a dynamical equation relating the flow matching velocity to one targeting the same distribution tilted by a reward, implicitly solving a stochastic optimal control problem. The new velocity inherits the regularity of stochastic interpolant transports while also being the minimizer of an objective with strictly lower variance than flow matching itself. The update to the velocity field can be interpreted as the sum of all joint cumulants of the stochastic interpolant and copies of the reward, and to first order is their covariance. The algorithms do not require any access to gradients of the reward or backpropagating through trajectories of the flow or diffusion. We empirically verify that the approach is efficient and highly scalable, providing state-of-the-art results on sampling under Lennard-Jones potentials and competitive results on fine-tuning Stable Diffusion, without requiring reward multipliers. It can also be straightforwardly applied to tilting few-step flow map models.
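The first-order (covariance) form of the velocity update described above can be illustrated in a few lines. The sketch below estimates the correction as the Monte Carlo covariance between a conditional flow-matching velocity target and a reward, using no reward gradients and no backpropagation through trajectories. The quadratic reward, the Gaussian base and "data" distributions, and the linear interpolant are illustrative assumptions, not the paper's actual algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def reward(x1):
    # hypothetical reward, peaked at x = 2 (an assumption for illustration)
    return -(x1 - 2.0) ** 2

n = 100_000
x0 = rng.standard_normal(n)          # base (noise) samples
x1 = rng.standard_normal(n) + 1.0    # stand-in "data" samples
t = 0.5
xt = (1 - t) * x0 + t * x1           # linear stochastic interpolant at time t

target = x1 - x0                     # conditional flow-matching velocity target
r = reward(x1)

# first-order update: covariance of the velocity target with the reward,
# estimated by plain Monte Carlo -- no grad(reward), no backprop through flows
delta_v = np.mean(target * (r - r.mean()))
```

For this N(1, 1) "data" and reward, the exact covariance works out to 2, i.e. a positive correction pushing mass toward the reward peak above the data mean.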
Much Ado About Noising: Dispelling the Myths of Generative Robotic Control
Pan, Chaoyi, Anantharaman, Giri, Huang, Nai-Chieh, Jin, Claire, Pfrommer, Daniel, Yuan, Chenyang, Permenter, Frank, Qu, Guannan, Boffi, Nicholas, Shi, Guanya, Simchowitz, Max
Long-horizon, dexterous manipulation tasks such as furniture assembly, food preparation, and manufacturing have been a holy grail in robotics. Recent large robot action models (Team et al., 2025; Black et al., 2024; Kim et al., 2024) have made substantial breakthroughs towards these goals by imitating expert demonstrations of varying quality. We provide a more comprehensive review of related work in Section 6, but highlight here a key trend: while supervised learning from demonstration, also known as behavior cloning (BC), has been applied across domains for decades (Pomerleau, 1988), its recent success in robotic manipulation has coincided with the adoption of what we term generative control policies (GCPs): robotic control policies that use generative modeling architectures, such as diffusion models, flow models, and autoregressive transformers, as parameterizations of the mapping from observation to action. Given the seemingly transformative nature of GCPs for robot learning, there has been much speculation about the origin of their superior performance relative to policies trained with a regression loss, henceforth regression control policies (RCPs). GCPs, by modeling conditional distributions over actions, are uniquely suited to the multi-task pretraining paradigm popular in today's large robotic models.
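The distinction between GCPs and RCPs in the face of multimodal demonstrations can be caricatured in a few lines: a regression loss averages over modes, while a generative policy samples from them. The bimodal action set and the empirical-resampling "generative" policy below are toy stand-ins, not any architecture from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy multimodal demonstrations: for one observation, experts go either
# left (-1) or right (+1) with equal probability.
demo_actions = rng.choice([-1.0, 1.0], size=1000)

# Regression control policy (RCP): a least-squares fit collapses the two
# modes to their mean -- an action no demonstrator ever took.
rcp_action = demo_actions.mean()

# Generative control policy (GCP), caricatured as sampling from the
# empirical conditional action distribution instead of averaging it.
gcp_actions = rng.choice(demo_actions, size=1000)
```

The RCP output sits near 0, between the modes, while every GCP sample is a valid expert action.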
Test-time scaling of diffusions with flow maps
Sabour, Amirmojtaba, Albergo, Michael S., Domingo-Enrich, Carles, Boffi, Nicholas M., Fidler, Sanja, Kreis, Karsten, Vanden-Eijnden, Eric
A common recipe to improve diffusion models at test time so that samples score highly against a user-specified reward is to introduce the gradient of the reward into the dynamics of the diffusion itself. This procedure is often ill-posed, as user-specified rewards are usually only well defined on the data distribution at the end of generation. While common workarounds are to use a denoiser to estimate what a sample would have been at the end of generation, we propose a simple solution to this problem by working directly with a flow map. By exploiting a relationship between the flow map and the velocity field governing the instantaneous transport, we construct an algorithm, Flow Map Trajectory Tilting (FMTT), which provably performs better ascent on the reward than standard test-time methods involving the gradient of the reward. The approach can be used either to perform exact sampling via importance weighting or to run principled search that identifies local maximizers of the reward-tilted distribution. We demonstrate the efficacy of our approach against other look-ahead techniques, and show how the flow map enables engagement with complicated reward functions that make possible new forms of image editing, e.g. by interfacing with vision language models.
Figure 1: Test-time search can overcome model biases and reliably sample from regions of the distribution (e.g., precise clock times) that baselines fail to capture.
Large-scale foundation models are built out of diffusions (Ho et al., 2020; Song et al., 2020) or flow-based transport (Lipman et al., 2022; Albergo & Vanden-Eijnden, 2022; Albergo et al., 2023; Liu et al., 2022). In this paradigm, performing generation amounts to numerically solving an ordinary or stochastic differential equation (ODE/SDE), the coefficients of which are learned neural networks.
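The "exact sampling via importance weighting" route can be sketched with self-normalized importance weights over end-of-generation samples. The standard-normal stand-in for flow-map outputs and the quadratic reward below are illustrative assumptions, not the FMTT algorithm itself.

```python
import numpy as np

rng = np.random.default_rng(2)

def reward(x):
    # hypothetical reward peaked at x = 1 (illustrative assumption)
    return -(x - 1.0) ** 2

# stand-in for end-of-generation samples produced by a flow map
x = rng.standard_normal(50_000)

# self-normalized importance weights targeting the reward-tilted density
# p(x) * exp(reward(x)) / Z, with a max-shift for numerical stability
logw = reward(x)
w = np.exp(logw - logw.max())
w /= w.sum()

tilted_mean = np.sum(w * x)   # expectation under the reward-tilted law
```

For a standard normal base and this reward, the tilted law is N(2/3, 1/3), so tilted_mean lands near 0.667 rather than the untilted 0.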
Time Extrapolation with Graph Convolutional Autoencoder and Tensor Train Decomposition
Chen, Yuanhong, Pichi, Federico, Gao, Zhen, Rozza, Gianluigi
Graph autoencoders have gained attention in nonlinear reduced-order modeling of parameterized partial differential equations defined on unstructured grids. Although they provide a geometrically consistent way of treating complex domains, applying such architectures to parameterized dynamical systems for temporal prediction beyond the training data, i.e. the extrapolation regime, remains challenging due to the simultaneous need for temporal causality and generalizability in the parametric space. In this work, we explore the integration of graph convolutional autoencoders (GCAs) with tensor train (TT) decomposition and Operator Inference (OpInf) to develop a time-consistent reduced-order model. In particular, high-fidelity snapshots are represented as a combination of parametric, spatial, and temporal cores via TT decomposition, while OpInf is used to learn the evolution of the latter. Moreover, we enhance generalization performance by developing a multi-fidelity two-stage approach in the framework of Deep Operator Networks (DeepONet), treating the spatial and temporal cores as the trunk networks and the parametric core as the branch network. Numerical results, including heat-conduction, advection-diffusion, and vortex-shedding phenomena, demonstrate strong performance in learning the dynamics in the extrapolation regime for complex geometries, also in comparison with state-of-the-art approaches.
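The splitting of a snapshot tensor into parametric, spatial, and temporal cores can be sketched with a plain TT-SVD on a 3-way array. The synthetic tensor, the chosen rank, and the sequential-SVD construction below are a minimal illustration of the decomposition only; the GCA latent spaces and the OpInf model for the temporal core are not shown.

```python
import numpy as np

rng = np.random.default_rng(3)

P, S, T, r = 5, 40, 30, 3   # parameters x space x time, TT rank (assumed)

# synthetic snapshot tensor built to have exact TT structure
A = np.einsum('pa,asb,bt->pst',
              rng.standard_normal((P, r)),
              rng.standard_normal((r, S, r)),
              rng.standard_normal((r, T)))

def tt_svd(A, rank):
    """Plain TT-SVD of a 3-way tensor into (parametric, spatial, temporal) cores."""
    P, S, T = A.shape
    # first unfolding: parameters vs (space * time)
    U, s, Vt = np.linalg.svd(A.reshape(P, S * T), full_matrices=False)
    G1 = U[:, :rank]                                  # parametric core (P, r1)
    rest = (s[:rank, None] * Vt[:rank]).reshape(rank * S, T)
    # second unfolding: (r1 * space) vs time
    U2, s2, Vt2 = np.linalg.svd(rest, full_matrices=False)
    G2 = U2[:, :rank].reshape(rank, S, rank)          # spatial core (r1, S, r2)
    G3 = s2[:rank, None] * Vt2[:rank]                 # temporal core (r2, T)
    return G1, G2, G3

G1, G2, G3 = tt_svd(A, rank=r)
A_hat = np.einsum('pa,asb,bt->pst', G1, G2, G3)
```

Because the synthetic tensor has exact TT rank 3, the reconstruction is exact up to floating-point error; on real snapshot data the truncated ranks trade accuracy for compression.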
Differentiable Physics-Neural Models enable Learning of Non-Markovian Closures for Accelerated Coarse-Grained Physics Simulations
Xue, Tingkai, Ooi, Chin Chun, Ge, Zhengwei, Leong, Fong Yew, Li, Hongying, Kang, Chang Wei
Numerical simulations provide key insights into many physical, real-world problems. However, while these simulations are solved on a full 3D domain, most analyses require only a reduced set of metrics (e.g. plane-level concentrations). This work presents a hybrid physics-neural model that predicts scalar transport in a complex domain orders of magnitude faster than the 3D simulation (from hours to less than 1 min). This end-to-end differentiable framework jointly learns the physical model parameterization (i.e. orthotropic diffusivity) and a non-Markovian neural closure model to capture unresolved, 'coarse-grained' effects, thereby enabling stable, long-time-horizon rollouts. The proposed model is data-efficient (learning from only 26 training samples) and can be flexibly extended to an out-of-distribution scenario (with a moving source), achieving a Spearman correlation coefficient of 0.96 at the final simulation time. Overall, results show that this differentiable physics-neural framework enables fast, accurate, and generalizable coarse-grained surrogates for physical phenomena.
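The structure of such a hybrid rollout, resolved physics plus a learned non-Markovian closure over a short history window, can be sketched abstractly. The scalar decay "physics", the linear memory kernel, and all coefficients below are made-up stand-ins for illustration, not the paper's model or its learned parameters.

```python
import numpy as np

def physics_step(u, dt=0.1, kappa=0.5):
    # resolved coarse physics: simple linear decay (stand-in for diffusion)
    return u - dt * kappa * u

def closure(history, w):
    # non-Markovian closure: linear memory kernel over past states,
    # w[0] weighting the most recent state (weights are assumed, not learned here)
    return sum(wi * hi for wi, hi in zip(w, reversed(history)))

def rollout(u0, w, steps=50, memory=3):
    u, hist = u0, [u0] * memory
    traj = [u0]
    for _ in range(steps):
        u = physics_step(u) + closure(hist, w)   # physics + closure correction
        hist = hist[1:] + [u]                    # slide the history window
        traj.append(u)
    return np.array(traj)

traj = rollout(1.0, w=[0.01, 0.0, 0.0])
```

In the real framework both the physical parameterization and the closure are trained jointly by differentiating through exactly this kind of rollout.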
GenRecal: Generation after Recalibration from Large to Small Vision-Language Models
Lee, Byung-Kwan, Hachiuma, Ryo, Ro, Yong Man, Wang, Yu-Chiang Frank, Wu, Yueh-Hua
Recent advancements in vision-language models (VLMs) have leveraged large language models (LLMs) to achieve performance on par with closed-source systems like GPT-4V. However, deploying these models in real-world scenarios, particularly on resource-constrained devices, remains challenging due to their substantial computational demands. This has spurred interest in distilling knowledge from large VLMs into smaller, more efficient counterparts. A key challenge arises here from the diversity of VLM architectures, which are built on different LLMs and employ varying token types, differing in vocabulary size, token splits, and token index ordering. To address this limitation to a specific VLM type, we present Generation after Recalibration (GenRecal), a general-purpose distillation framework for VLMs. GenRecal incorporates a Recalibrator that aligns and adapts feature representations between heterogeneous VLMs, enabling effective knowledge transfer across different types of VLMs. Through extensive experiments on multiple challenging benchmarks, we demonstrate that GenRecal significantly improves baseline performances, eventually outperforming large-scale open- and closed-source VLMs.
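The role of a recalibrator can be illustrated as feature alignment between hidden spaces of mismatched width. The random features and the closed-form least-squares linear map below are a toy stand-in for GenRecal's learned Recalibrator module, chosen only to show why alignment reduces the feature-distillation loss.

```python
import numpy as np

rng = np.random.default_rng(4)

d_teacher, d_student, n = 64, 32, 200

# hypothetical frozen features from two heterogeneous VLMs (assumed shapes)
f_teacher = rng.standard_normal((n, d_teacher))
f_student = rng.standard_normal((n, d_student))

# minimal "recalibrator": least-squares linear map from the student's
# feature space into the teacher's feature space
W, *_ = np.linalg.lstsq(f_student, f_teacher, rcond=None)
aligned = f_student @ W

# feature-distillation loss after recalibration, vs. a no-alignment baseline
loss = np.mean((aligned - f_teacher) ** 2)
baseline = np.mean(f_teacher ** 2)   # loss of predicting all-zero features
```

Even this crude linear alignment lowers the distillation loss below the unaligned baseline; the actual Recalibrator is trained jointly with the generation objective.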